Confidence estimation for t-SNE embeddings using random forest


Abstract

Dimensionality reduction algorithms are commonly used for reducing the dimension of multi-dimensional data to visualize them on a standard display. Although many dimensionality reduction algorithms, such as t-distributed Stochastic Neighborhood Embedding, aim to preserve close neighborhoods in low-dimensional space, they might not accomplish that for every sample and eventually produce erroneous representations. In this study, we developed a supervised confidence estimation algorithm for detecting erroneous samples in embeddings. Our algorithm generates a confidence score for each sample in an embedding based on a distance-oriented random forest regressor. We evaluate its performance on both intra- and inter-domain data and compare it with the neighborhood preservation ratio as our baseline. Our results showed that the resulting confidence score provides distinctive information about the correctness of any sample in an embedding compared to the baseline. The source code is available at https://github.com/gsaygili/dimred .
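The baseline named in the abstract, the neighborhood preservation ratio, can be illustrated with a minimal sketch. The abstract does not give the paper's exact definition, so the version below assumes a common one: for each sample, the fraction of its k nearest neighbors in the original space that are also among its k nearest neighbors in the embedding. The function names, the brute-force distance computation, and the default k=5 are illustrative assumptions, not taken from the authors' repository.

```python
import numpy as np


def knn_indices(points, k):
    """Indices of the k nearest neighbors of every row, excluding the row itself.

    Brute-force pairwise Euclidean distances; fine for small illustrative data.
    """
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dists, np.inf)  # a sample is never its own neighbor
    return np.argsort(dists, axis=1)[:, :k]


def neighborhood_preservation_ratio(high, low, k=5):
    """Per-sample fraction of high-dimensional k-NN kept in the embedding.

    Assumed definition (not confirmed by the abstract): |N_high ∩ N_low| / k.
    """
    nn_high = knn_indices(high, k)
    nn_low = knn_indices(low, k)
    shared = [len(set(a) & set(b)) for a, b in zip(nn_high, nn_low)]
    return np.array(shared) / k


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    high = rng.normal(size=(200, 10))  # synthetic stand-in for real data
    low = high[:, :2]                  # crude "embedding": keep two coordinates
    npr = neighborhood_preservation_ratio(high, low, k=5)
    print("mean preservation ratio:", npr.mean())
```

A score near 1 means a sample's local neighborhood survived the embedding; a score near 0 flags a sample whose neighbors were scattered. The paper's own confidence score is described as a learned alternative to this baseline.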


Similar resources

Visualizing Data using t-SNE

We present a new technique called “t-SNE” that visualizes high-dimensional data by giving each datapoint a location in a two- or three-dimensional map. The technique is a variation of Stochastic Neighbor Embedding (Hinton and Roweis, 2002) that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map. ...


Fast Optimization for t-SNE

The paper presents an alternative optimization technique for t-SNE that is orders of magnitude faster than the original optimization technique, and that produces results that are at least as good.


Accelerating t-SNE using tree-based algorithms

The paper investigates the acceleration of t-SNE, an embedding technique that is commonly used for the visualization of high-dimensional data in scatter plots, using two tree-based algorithms. In particular, the paper develops variants of the Barnes-Hut algorithm and of the dual-tree algorithm that approximate the gradient used for learning t-SNE embeddings in O(N log N). Our experiments show that ...


Supplemental Material for Visualizing Data using t-SNE

In this supplementary material, we present the results of our experiments that compare the visualizations produced by t-SNE with those produced by seven other dimensionality reduction techniques on five datasets from a variety of domains. Some of these results were already presented in the paper; however, we present the results here in a different form. The five datasets we employed in our expe...



Journal

Journal title: International Journal of Machine Learning and Cybernetics

Year: 2022

ISSN: 1868-8071, 1868-808X

DOI: https://doi.org/10.1007/s13042-022-01635-2